Results 1 - 20 of 230
1.
Mol Ecol ; : e17257, 2023 Dec 27.
Article in English | MEDLINE | ID: mdl-38149334

ABSTRACT

How local adaptation takes place remains a fundamental question in evolutionary biology. The variation of allele frequencies in genes under selection over environmental gradients remains mainly theoretical, and its empirical assessment would help in understanding how adaptation happens over environmental clines. To bring new insights to this issue, we set up a broad framework which aimed to compare the adaptive trajectories over environmental clines in two domesticated mammal species co-distributed in diversified landscapes. We sequenced the genomes of 160 sheep and 161 goats extensively managed along environmental gradients, including temperature, rainfall, seasonality and altitude, to identify genes and biological processes shaping local adaptation. Allele frequencies at putatively adaptive loci were rarely found to vary gradually along environmental gradients, but rather displayed a discontinuous shift at the extremities of environmental clines. Of the 430 candidate adaptive genes identified, only 6 were orthologous between sheep and goats, and those responded differently to environmental pressures, suggesting different putative mechanisms involved in local adaptation in these two closely related species. Interestingly, the genomes of the two species were impacted differently by the environment: genes related to signatures of selection were most related to altitude, slope and rainfall seasonality for sheep, and to summer temperature and spring rainfall for goats. The diversity of candidate adaptive pathways may result from a high number of biological functions involved in the adaptations to multiple eco-climatic gradients, and a differential role of climatic drivers on the two species, despite their co-distribution along the same environmental gradients. This study describes empirical examples of clinal variation in putatively adaptive alleles with different patterns in allele frequency distributions over continuous environmental gradients, thus showing the diversity of genetic responses in adaptive landscapes and opening new horizons for understanding genomics of adaptation in mammalian species and beyond.

2.
bioRxiv ; 2023 Nov 06.
Article in English | MEDLINE | ID: mdl-37986808

ABSTRACT

Mapping the functional human genome and impact of genetic variants is often limited to European-descent population samples. To aid in overcoming this limitation, we measured gene expression using RNA sequencing in lymphoblastoid cell lines (LCLs) from 599 individuals from six African populations to identify novel transcripts including those not represented in the hg38 reference genome. We used whole genomes from the 1000 Genomes Project and 164 Maasai individuals to identify 8,881 expression and 6,949 splicing quantitative trait loci (eQTLs/sQTLs), and 2,611 structural variants associated with gene expression (SV-eQTLs). We further profiled chromatin accessibility using ATAC-Seq in a subset of 100 representative individuals, to identify chromatin accessibility quantitative trait loci (caQTLs) and allele-specific chromatin accessibility, and provide predictions for the functional effect of 78.9 million variants on chromatin accessibility. Using this map of eQTLs and caQTLs we fine-mapped GWAS signals for a range of complex diseases. Combined, this work expands global functional genomic data to identify novel transcripts, functional elements and variants, understand population genetic history of molecular quantitative trait loci, and further resolve the genetic basis of multiple human traits and disease.

3.
Genome Biol ; 24(1): 223, 2023 10 05.
Article in English | MEDLINE | ID: mdl-37798615

ABSTRACT

Crop pangenomes made from individual cultivar assemblies promise easy access to conserved genes, but genome content variability and inconsistent identifiers hamper their exploration. To address this, we define pangenes, which summarize a species' coding potential and link back to original annotations. The protocol get_pangenes performs whole genome alignments (WGA) to call syntenic gene models based on coordinate overlaps. A benchmark with small and large plant genomes shows that pangenes recapitulate phylogeny-based orthologies and produce complete soft-core gene sets. Moreover, WGAs support lift-over and help confirm gene presence-absence variation. Source code and documentation: https://github.com/Ensembl/plant-scripts .
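
The abstract above mentions calling syntenic gene models "based on coordinate overlaps". The abstract does not give the actual get_pangenes procedure, so the following is only a minimal sketch of the general idea: once two gene models have been projected onto a common coordinate system (here assumed to have been done already by a whole genome alignment), they can be compared by interval overlap. The names `Interval`, `overlap_bp` and `are_collinear`, and the 50% threshold, are hypothetical choices for illustration and are not taken from the tool.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A gene model on a shared coordinate system (hypothetical structure)."""
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def overlap_bp(a: Interval, b: Interval) -> int:
    """Number of base pairs shared by two intervals (0 if none)."""
    if a.chrom != b.chrom or a.strand != b.strand:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def are_collinear(a: Interval, b: Interval, min_fraction: float = 0.5) -> bool:
    """Treat two gene models as matching when the overlap covers at least
    min_fraction of the shorter one. The threshold is an assumption."""
    shorter = min(a.end - a.start + 1, b.end - b.start + 1)
    return overlap_bp(a, b) >= min_fraction * shorter
```

Measuring overlap against the shorter interval keeps a short gene model nested inside a longer one from being rejected; a real pipeline would likely apply further criteria that the abstract does not spell out.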


Subjects
Genome, Plant ; Software
4.
Nature ; 621(7978): 344-354, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37612512

ABSTRACT

The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications [1-3]. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished [4,5]. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029-base-pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, showing the complete ampliconic structures of gene families TSPY, DAZ and RBMY; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a previous assembly of the CHM13 genome [4] and mapped available population variation, clinical variants and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human chromosomes.


Subjects
Chromosomes, Human, Y ; Genomics ; Sequence Analysis, DNA ; Humans ; Base Sequence ; Chromosomes, Human, Y/genetics ; DNA, Satellite/genetics ; Genetic Variation/genetics ; Genetics, Population ; Genomics/methods ; Genomics/standards ; Heterochromatin/genetics ; Multigene Family/genetics ; Reference Standards ; Segmental Duplications, Genomic/genetics ; Sequence Analysis, DNA/standards ; Tandem Repeat Sequences/genetics ; Telomere/genetics
5.
Nat Methods ; 20(8): 1159-1169, 2023 08.
Article in English | MEDLINE | ID: mdl-37443337

ABSTRACT

The detection of circular RNA molecules (circRNAs) is typically based on short-read RNA sequencing data processed using computational tools. Numerous such tools have been developed, but a systematic comparison with orthogonal validation is missing. Here, we set up a circRNA detection tool benchmarking study, in which 16 tools detected more than 315,000 unique circRNAs in three deeply sequenced human cell types. Next, 1,516 predicted circRNAs were validated using three orthogonal methods. Generally, tool-specific precision is high and similar (median of 98.8%, 96.3% and 95.5% for qPCR, RNase R and amplicon sequencing, respectively) whereas the sensitivity and number of predicted circRNAs (ranging from 1,372 to 58,032) are the most significant differentiators. Of note, precision values are lower when evaluating low-abundance circRNAs. We also show that the tools can be used complementarily to increase detection sensitivity. Finally, we offer recommendations for future circRNA detection and validation.
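
The abstract above compares tools by precision and sensitivity and notes that tools can be combined to increase sensitivity. The study's exact definitions are not given in the abstract, so the sketch below uses the standard ones: precision as the fraction of tested predictions that validate, and sensitivity as the fraction of a reference set that is recovered. The function names and any identifiers passed to them are hypothetical and do not come from the study.

```python
def precision(validated: int, tested: int) -> float:
    """Fraction of tested predictions confirmed by an orthogonal method."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return validated / tested


def sensitivity(calls: set, truth: set) -> float:
    """Fraction of a reference (truth) set recovered by a set of calls."""
    if not truth:
        raise ValueError("truth set must not be empty")
    return len(calls & truth) / len(truth)


def combined_calls(*call_sets: set) -> set:
    """Union of the calls made by several tools."""
    return set().union(*call_sets)
```

Taking the union of two tools' calls can only keep or raise sensitivity against a fixed reference set, which is the sense in which tools are complementary; the effect on precision depends on how many additional false positives the union admits.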


Subjects
Benchmarking ; RNA, Circular ; Humans ; RNA, Circular/genetics ; RNA/genetics ; RNA/metabolism ; Sequence Analysis, RNA/methods
6.
Front Plant Sci ; 14: 1103035, 2023.
Article in English | MEDLINE | ID: mdl-37521909

ABSTRACT

The DNA Features pipeline is the analysis pipeline at EMBL-EBI that annotates repeat elements, including transposable elements. With Ensembl's goal to stay at the cutting edge of genome annotation, we proved that this pipeline needed an update. We then created a new analysis that allowed the Ensembl database to store the repeat classification from the PGSB repeat classification (Recat). This new dataset was then fetched using Perl scripts and used to prove that the pipeline modification induced a gain in sensitivity. Finally, we performed a comparative analysis of transposable element distribution in all plant species available, raising new questions about transposable elements in certain branches of the taxonomic tree.

7.
Nucleic Acids Res ; 51(D1): D1053-D1060, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36350643

ABSTRACT

It is 24 years since the IPD-IMGT/HLA Database, http://www.ebi.ac.uk/ipd/imgt/hla/, was first released, providing the HLA community with a searchable repository of highly curated HLA sequences. The database now contains over 35 000 alleles of the human Major Histocompatibility Complex (MHC) named by the WHO Nomenclature Committee for Factors of the HLA System. This complex contains the most polymorphic genes in the human genome and is now considered hyperpolymorphic. The IPD-IMGT/HLA Database provides a stable and user-friendly repository for this information. Uptake of Next Generation Sequencing technology in recent years has driven an increase in the number of alleles and the length of sequences submitted. As the size of the database has grown, the traditional methods of accessing and presenting these data have been challenged. In response, we have developed a suite of tools providing an enhanced user experience to our traditional web-based users while creating new programmatic access for our bioinformatics user base. This suite of tools is powered by the IPD-API, an Application Programming Interface (API), providing scalable and flexible access to the database. The IPD-API provides a stable platform for our future development, allowing us to meet the future challenges of the HLA field and needs of the community.


Subjects
Databases, Genetic ; HLA Antigens ; Humans ; HLA Antigens/genetics ; Histocompatibility Antigens/genetics ; Major Histocompatibility Complex/genetics ; Software ; Alleles
8.
Sci Rep ; 12(1): 20791, 2022 12 01.
Article in English | MEDLINE | ID: mdl-36456625

ABSTRACT

We searched a database of single-gene knockout (KO) mice produced by the International Mouse Phenotyping Consortium (IMPC) to identify candidate ciliopathy genes. We first screened for phenotypes in mouse lines with both ocular and renal or reproductive trait abnormalities. The STRING protein interaction tool was used to identify interactions between known cilia gene products and those encoded by the genes in individual knockout mouse strains in order to generate a list of "candidate ciliopathy genes." From this list, 32 genes encoded proteins predicted to interact with known ciliopathy proteins. Of these, 25 had no previously described roles in ciliary pathobiology. Histological and morphological evidence of phenotypes found in ciliopathies in knockout mouse lines is presented as examples (genes Abi2, Wdr62, Ap4e1, Dync1li1, and Prkab1). Phenotyping data and descriptions generated on IMPC mouse lines are useful for mechanistic studies, target discovery, rare disease diagnosis, and preclinical therapeutic development trials. Here we demonstrate the effective use of the IMPC phenotype data to uncover genes with no previous role in ciliary biology, which may be clinically relevant for identification of novel disease genes implicated in ciliopathies.


Subjects
Ciliopathies ; Mice ; Animals ; Mice, Knockout ; Ciliopathies/genetics ; Gene Knockout Techniques ; Cilia/genetics ; Databases, Factual ; Nerve Tissue Proteins ; Cell Cycle Proteins
12.
Nature ; 604(7906): 437-446, 2022 04.
Article in English | MEDLINE | ID: mdl-35444317

ABSTRACT

The human reference genome is the most widely used resource in human genetics and is due for a major update. Its current structure is a linear composite of merged haplotypes from more than 20 people, with a single individual comprising most of the sequence. It contains biases and errors within a framework that does not represent global human genomic variation. A high-quality reference with global representation of common variants, including single-nucleotide variants, structural variants and functional elements, is needed. The Human Pangenome Reference Consortium aims to create a more sophisticated and complete human reference genome with a graph-based, telomere-to-telomere representation of global genomic diversity. Here we leverage innovations in technology, study design and global partnerships with the goal of constructing the highest-possible quality human pangenome reference. Our goal is to improve data representation and streamline analyses to enable routine assembly of complete diploid genomes. With attention to ethical frameworks, the human pangenome reference will contain a more accurate and diverse representation of global genomic variation, improve gene-disease association studies across populations, expand the scope of genomics research to the most repetitive and polymorphic regions of the genome, and serve as the ultimate genetic resource for future biomedical research and precision medicine.


Subjects
Genome, Human ; Genomics ; Genome, Human/genetics ; Haplotypes/genetics ; High-Throughput Nucleotide Sequencing ; Humans ; Sequence Analysis, DNA
13.
Nature ; 604(7905): 310-315, 2022 04.
Article in English | MEDLINE | ID: mdl-35388217

ABSTRACT

Comprehensive genome annotation is essential to understand the impact of clinically relevant variants. However, the absence of a standard for clinical reporting and browser display complicates the process of consistent interpretation and reporting. To address these challenges, Ensembl/GENCODE [1] and RefSeq [2] launched a joint initiative, the Matched Annotation from NCBI and EMBL-EBI (MANE) collaboration, to converge on human gene and transcript annotation and to jointly define a high-value set of transcripts and corresponding proteins. Here, we describe the MANE transcript sets for use as universal standards for variant reporting and browser display. The MANE Select set identifies a representative transcript for each human protein-coding gene, whereas the MANE Plus Clinical set provides additional transcripts at loci where the Select transcripts alone are not sufficient to report all currently known clinical variants. Each MANE transcript represents an exact match between the exonic sequences of an Ensembl/GENCODE transcript and its counterpart in RefSeq such that the identifiers can be used synonymously. We have now released MANE Select transcripts for 97% of human protein-coding genes, including all American College of Medical Genetics and Genomics Secondary Findings list v3.0 (ref. 3) genes. MANE transcripts are accessible from major genome browsers and key resources. Widespread adoption of these transcript sets will increase the consistency of reporting, facilitate the exchange of data regardless of the annotation source and help to streamline clinical interpretation.


Subjects
Computational Biology ; Databases, Genetic ; Genomics ; Genome ; Humans ; Information Dissemination ; Molecular Sequence Annotation ; National Library of Medicine (U.S.) ; United States
14.
Methods Mol Biol ; 2443: 27-55, 2022.
Article in English | MEDLINE | ID: mdl-35037199

ABSTRACT

Ensembl Plants ( http://plants.ensembl.org ) offers genome-scale information for plants, with four releases per year. As of release 47 (April 2020) it features 79 species and includes genome sequence, gene models, and functional annotation. Comparative analyses help reconstruct the evolutionary history of gene families, genomes, and components of polyploid genomes. Some species have gene expression baseline reports or variation across genotypes. While the data can be accessed through the Ensembl genome browser, here we review specifically how our plant genomes can be interrogated programmatically and the data downloaded in bulk. These access routes are generally consistent across Ensembl for other non-plant species, including plant pathogens, pests, and pollinators.


Subjects
Databases, Genetic ; Genomics ; Genome, Plant ; Molecular Sequence Annotation ; Plants/genetics ; Software
16.
Proc Natl Acad Sci U S A ; 119(4)2022 01 25.
Article in English | MEDLINE | ID: mdl-35042802

ABSTRACT

A global international initiative, such as the Earth BioGenome Project (EBP), requires both agreement and coordination on standards to ensure that the collective effort generates rapid progress toward its goals. To this end, the EBP initiated five technical standards committees comprising volunteer members from the global genomics scientific community: Sample Collection and Processing, Sequencing and Assembly, Annotation, Analysis, and IT and Informatics. The current versions of the resulting standards documents are available on the EBP website, with the recognition that opportunities, technologies, and challenges may improve or change in the future, requiring flexibility for the EBP to meet its goals. Here, we describe some highlights from the proposed standards, and areas where additional challenges will need to be met.


Subjects
Base Sequence/genetics ; Eukaryota/genetics ; Genomics/standards ; Animals ; Biodiversity ; Genomics/methods ; Humans ; Reference Standards ; Reference Values ; Sequence Analysis, DNA/methods ; Sequence Analysis, DNA/standards
17.
Fish Res ; 249: 106231, 2022 May.
Article in English | MEDLINE | ID: mdl-36798657

ABSTRACT

The Atlantic herring Clupea harengus L. has a vast geographical distribution and a complex population structure with a few very large migratory units and many small local populations. Each population has its own spawning ground and/or time, thereby maintaining its genetic integrity. Several herring populations migrate between common feeding grounds and over-wintering areas, resulting in frequent mixing of populations. Thus, many herring fisheries are based on mixed populations of different demographic status. In order to avoid over-exploitation of weak populations and to conserve biodiversity, understanding the population structure and population mixing is important for maintaining biologically sustainable herring fisheries. The aim of this study was to investigate the genetic population structure of herring in the Faroese and surrounding waters, and to develop genetic markers for distinguishing between four herring management units (often called stocks), namely the Norwegian spring-spawning herring (NSSH), Icelandic summer-spawning herring (ISSH), North Sea autumn-spawning herring (NSAH), and Faroese autumn-spawning herring (FASH). Herring from the four stocks were sequenced at low coverage, and single nucleotide polymorphisms (SNPs) were called and used for population structure analysis and individual assignment. An ancestry-informative SNP panel with 118 SNPs was developed and tested on 240 individuals. The results showed that all four stocks appeared to be genetically differentiated populations, but with lower levels of differentiation between FASH and ISSH than between the other two populations. Overall assignment rate with the SNP panel was 80.7%, and agreement between the genetic and traditional visual assignment was 75.5%. The NSAH and NSSH samples had the highest assignment rate (100% and 98.3%, respectively) and highest agreement between traditional and genetic assignment methods (96.6% and 94.9%, respectively). The FASH and ISSH samples had substantially lower assignment rates (72.9% and 51.7%, respectively) and agreement between traditional and genetic methods (39.5% and 48.4%, respectively).

18.
Nucleic Acids Res ; 50(D1): D898-D911, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34718728

ABSTRACT

The Eukaryotic Pathogen, Vector and Host Informatics Resource (VEuPathDB, https://veupathdb.org) represents the 2019 merger of VectorBase with the EuPathDB projects. As a Bioinformatics Resource Center funded by the National Institutes of Health, with additional support from the Wellcome Trust, VEuPathDB supports >500 organisms comprising invertebrate vectors, eukaryotic pathogens (protists and fungi) and relevant free-living or non-pathogenic species or hosts. Designed to empower researchers with access to Omics data and bioinformatic analyses, VEuPathDB projects integrate >1700 pre-analysed datasets (and associated metadata) with advanced search capabilities, visualizations, and analysis tools in a graphic interface. Diverse data types are analysed with standardized workflows including an in-house OrthoMCL algorithm for predicting orthology. Comparisons are easily made across datasets, data types and organisms in this unique data mining platform. A new site-wide search facilitates access for both experienced and novice users. Upgraded infrastructure and workflows support numerous updates to the web interface, tools, searches and strategies, and Galaxy workspace where users can privately analyse their own data. Forthcoming upgrades include cloud-ready application architecture, expanded support for the Galaxy workspace, tools for interrogating host-pathogen interactions, improved interactions with affiliated databases (ClinEpiDB, MicrobiomeDB) and other scientific resources, and increased interoperability with the Bacterial & Viral BRC.


Subjects
Databases, Factual ; Disease Vectors/classification ; Host-Pathogen Interactions/genetics ; Phenotype ; User-Computer Interface ; Animals ; Apicomplexa/classification ; Apicomplexa/genetics ; Apicomplexa/pathogenicity ; Bacteria/classification ; Bacteria/genetics ; Bacteria/pathogenicity ; Communicable Diseases/microbiology ; Communicable Diseases/parasitology ; Communicable Diseases/pathology ; Communicable Diseases/transmission ; Computational Biology/methods ; Data Mining/methods ; Diplomonadida/classification ; Diplomonadida/genetics ; Diplomonadida/pathogenicity ; Fungi/classification ; Fungi/genetics ; Fungi/pathogenicity ; Humans ; Insecta/classification ; Insecta/genetics ; Insecta/pathogenicity ; Internet ; Nematoda/classification ; Nematoda/genetics ; Nematoda/pathogenicity ; Phylogeny ; Virulence ; Workflow
19.
Nucleic Acids Res ; 50(D1): D1216-D1220, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34718739

ABSTRACT

The European Variation Archive (EVA; https://www.ebi.ac.uk/eva/) is a resource for sharing all types of genetic variation data (SNPs, indels, and structural variants) for all species. The EVA was created in 2014 to provide FAIR access to genetic variation data and has since grown to be a primary resource for genomic variants hosting >3 billion records. The EVA and dbSNP have established a compatible global system to assign unique identifiers to all submitted genetic variants. The EVA is active within the Global Alliance of Genomics and Health (GA4GH), maintaining, contributing and implementing standards such as VCF, Refget and Variant Representation Specification (VRS). In this article, we describe the submission and permanent accessioning services along with the different ways the data can be retrieved by the scientific community.


Subjects
Computational Biology ; Databases, Genetic ; Genetic Variation/genetics ; Software ; Animals ; Genomic Structural Variation/genetics ; Genomics ; Humans ; INDEL Mutation/genetics ; Molecular Sequence Annotation ; Polymorphism, Single Nucleotide/genetics
20.
Nucleic Acids Res ; 50(D1): D980-D987, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34791407

ABSTRACT

The European Genome-phenome Archive (EGA - https://ega-archive.org/) is a resource for long term secure archiving of all types of potentially identifiable genetic, phenotypic, and clinical data resulting from biomedical research projects. Its mission is to foster hosted data reuse, enable reproducibility, and accelerate biomedical and translational research in line with the FAIR principles. Launched in 2008, the EGA has grown quickly, currently archiving over 4,500 studies from nearly one thousand institutions. The EGA operates a distributed data access model in which requests are made to the data controller, not to the EGA; therefore, the submitter keeps control over who has access to the data and under which conditions. Given the size and value of data hosted, the EGA is constantly improving its value chain, that is, how the EGA can contribute to enhancing the value of human health data by facilitating its submission, discovery, access, and distribution, as well as leading the design and implementation of standards and methods necessary to deliver the value chain. The EGA has become a key GA4GH Driver Project, leading multiple development efforts and implementing new standards and tools, and has been appointed as an ELIXIR Core Data Resource.


Subjects
Confidentiality/legislation & jurisprudence ; Genome, Human ; Information Dissemination/methods ; Phenomics/organization & administration ; Translational Research, Biomedical/methods ; Datasets as Topic ; Genotype ; History, 20th Century ; History, 21st Century ; Humans ; Information Dissemination/ethics ; Metadata/ethics ; Metadata/statistics & numerical data ; Phenomics/history ; Phenotype